A nonparametric alternative to the Cochran-Armitage trend test in genetic case-control association studies: The Jonckheere-Terpstra trend test

Identifications of novel genetic signals conferring susceptibility to human complex diseases is pivotal to the disease diagnosis, prevention, and treatment. Genetic association study is a powerful tool to discover candidate genetic signals that contribute to diseases, through statistical tests for correlation between the disease status and genetic variations in study samples. In such studies with a case-control design, a standard practice is to perform the Cochran-Armitage (CA) trend test under an additive genetic model, which suffers from power loss when the model assumption is wrong. The Jonckheere-Terpstra (JT) trend test is an alternative method to evaluate association in a nonparametric way. This study compares the power of the JT trend test and the CA trend test in various scenarios, including different sample sizes (200–2000), minor allele frequencies (0.05–0.4), and underlying modes of inheritance (dominant genetic model to recessive genetic model). By simulation and real data analysis, it is shown that in general the JT trend test has higher, similar, and lower power than the CA trend test when the underlying mode of inheritance is dominant, additive, and recessive, respectively; when the sample size is small and the minor allele frequency is low, the JT trend test outperforms the CA trend test across the spectrum of genetic models. In sum, the JT trend test is a valuable alternative to the CA trend test under certain circumstances with higher statistical power, which could lead to better detection of genetic signals to human diseases and finer dissection of their genetic architecture.


Introduction
Over the past fifteen years, genome-wide association studies have significantly expanded the knowledge base for genetic factors in important healthcare outcomes [1]. Such  identified numerous genetic signals contributing to various complex human diseases, which can be very important to the diseases' diagnosis, prevention, and treatment. One commonly used approach to test association in a case-control genetic study is the Cochran-Armitage (CA) trend test [2,3] under the assumption of an additive genetic model [4,5], which can reach the optimal power when the underlying genetic model is also additive. However, it can suffer from power loss when the true genetic model is nonadditive (see, e.g., [6][7][8][9]). Power loss is one critical issue in genetic association studies. On the one hand, conducting a statistical test with reduced power may fail to detect true genetic signals, leading to false negative results. On the other hand, to achieve the same level of statistical power, the sample sizes needed will increase, leading to higher study expenses and resource requirements.
To test for associations between the disease status and genetic variation, one alternative approach to the CA trend test is the Jonckheere-Terpstra (JT) trend test [10,11], which is a rank-based nonparametric test. The JT trend test does not make assumptions on genetic models or data distribution, and thus has the potential to achieve better statistical power than the parametric CA trend test under certain circumstances. The potentially higher power from the JT trend test may result in novel genetic discoveries for complex human diseases, which can help researchers better understand the genetic etiology and eventually aid in the development of effective diagnosis, prevention, and treatment strategies of the diseases. Although the JT trend test offers an alternative to the CA trend test with potential advantages, their comparative performance has not been examined in the genetic literature. In this study, we aim to fill this research gap by comparing the power of the two tests in various conditions via simulations and real data analysis. The knowledge gained in this study can help guide the model selection between the CA and JT trend tests when conducting genetic case-control studies in practice.

Methods
Consider a diallelic locus with the major and minor alleles denoted as a and A, respectively, the genotype distribution in a case-control study can be summarized as in Table 1. Specifically, denote r i and s i as the number of cases and controls, respectively, for genotype G i , where i2 {0,1,2} reflects the number of A alleles a subject has. Thus G 0 , G 1 , and G 2 correspond to genotypes aa, Aa, and AA, respectively. Denote by R, S, and n i the marginal sums such that R ¼ P 2 i¼0 r i ; S ¼ P 2 i¼0 s i , and n i = r i +s i , and by N the total sample size such that N ¼ R þ S ¼ P 2 i¼0 n i . Assume (r 0 , r 1 , r 2 ) follow a trinomial distribution with parameters R and (τ 0 , τ 1 , τ 2 ), and (s 0 , s 1 , s 2 ) follow a trinomial distribution with parameters S and (υ 0 , υ 1 , υ 2 ). The null hypothesis of no association between the disease and genotype is then H 0 : τ i = υ i , for i2{0,1,2}. Equivalently, we can also assume r i 's are drawn from binomial distributions Bin(n i , π i ). The null hypothesis of no association between the disease and genotype is H 0 : π 0 = π 1 = π 2 . Assuming G 0 , G 1 , and G 2 are three ordered categories, a restricted alternative hypothesis for a trend test is H 1 : π 0 �π 1 �π 2 or π 0 �π 1 �π 2 with at least one strict inequality. To test H 1 , the CA trend test assigns a set of scores (x 0 , x 1 , x 2 ) to G 0 , G 1 , and G 2 , respectively, with the constraints x 0 �x 1 �x 2 and x 0 <x 2 , and examines whether there is a linear relationship between π i 's and x i 's by fitting a linear regression model. The test statistic is . Under H 0 , T CA follows a χ 2 distribution with 1 degree-of-freedom (d.f.). The choices of (x 0 , x 1 , x 2 ) represent assumptions on the genetic models. In practice, the additive model with (x 0 , x 1 , x 2 ) = (0,0.5,1) is usually assumed because of its robustness.
Hereinafter we denote it as T Add CA . Alternatively, the JT trend test compares the ranks of subjects based on their affection status between genotype groups to test H 1 . Consider the disease status of case and control as an ordinal variable Y, and denote by . For large N and n i 0 s not too small, T JT also follows a χ 2 distribution with 1 d.f. under H 0 .
For simplicity, hereinafter we use T JT and T Add CA to refer to both tests and test statistics.

Simulation
We conduct simulations to compare performance between T JT and T Add CA in terms of statistical power under various conditions. Define the penetrance function for each genotype as f i = P (affected|G i ), i = 0, 1, 2, and define genotype relative risk as λ i = f i /f 0 ; thus λ 0 = 1. The dominant, additive, and recessive genetic models can be specified by λ 1 = λ 2 , λ 1 = (1+λ 2 )/2, and λ 1 = 1, respectively. Note that the genetic model is defined in regard to the minor allele in this study. The model can be reparameterized by defining λ 1 = 1+λcosθ and λ 2 = 1+λsinθ, where λ�0 is the distance between point P = (λ 1 , λ 2 ) and point O = (1,1), which determines how far the genetic effect is from the null, and θ2[π/4, π/2] is the angle between OP and the horizontal line in a two-dimensional space, which determines the genetic model [12]. Therefore, the null hypothesis can be rewritten as H 0 : λ = 0. In terms of genetic models, θ = π/4, arctan 2, and π/2 correspond to dominant, additive, and recessive models, respectively. We performed simulations under the following alternative settings. Assume a disease prevalence (K) of 0.1 and the minor allele frequency (MAF) q2(0.05,0.1,0.2,0.3). Fix the alternative hypothesis as λ = 1 and vary the genetic models by setting θ' = θ/π from 1/4 to 1/2, i.e., from the dominant model to the recessive model, with an increment of 0.01. Under each genetic model, penetrances are determined by f 0 ¼ K=½ð1 À qÞ 2 þ 2l 1 qð1 À qÞ þ l 2 q 2 � and f i = λ i f 0 . The probabilities of the two trinomial distributions for cases and controls are then τ i = P(G i )f i /K and υ i = P(G i )(1−f i )/(1−K), respectively. Consider a balanced design, i.e., R = S with the total sample size N2(200, 500, 1000). At each setting 10,000 replicates are simulated, and each dataset is examined by both T JT and T Add CA . The empirical power is estimated as the proportion of the replicates for which the pvalue is less than or equal to 0.05. In an additional set of simulations for wider range of sample size and MAF, we considered N = 1500 and 2000 as well as q = 0.4 across the genetic models.
Because the sample sizes and MAF were large, the effect size in the alternative hypothesis was set to λ = 0.5 to make the maximum power less than 1 for the sake of comparison.
The simulation results across the main settings are presented in Fig 1 and the results for the additional set of simulations are presented in S1 Table. In all situations T JT is more powerful

PLOS ONE
than T Add CA when the underlying genetic model is dominant. The power advantage of T JT diminishes as the genetic model evolves toward the additive model. In most situations except for small sample sizes and low MAFs, the two tests have approximately equivalent power when the underlying model is additive. T JT becomes less powerful than T Add CA and the disadvantage enlarges when the genetic model keeps evolving toward the recessive end. In case of low MAFs and small sample sizes, e.g., N = 200 & q�0.1 or N�1000 & q = 0.05, T JT is more powerful than T Add CA . To verify the above findings, for each simulation setting we construct a table with the value in each cell equal to the expected value under the trinomial distributions for cases and controls, respectively, i.e., E(r i ) = τ i N/2 and E(s i ) = υ i N/2, i2{0,1,2}. Specifically, in each simulation setting, using the fixed combination of sample size (N), MAF (q), and genetic model (θ), the cell probabilities of each genotype for cases and controls in the genotype distribution  Table 2. Consistent with the simulation results, ΔT>0 when the genetic model was dominant; |ΔT|<2% when the genetic model was additive, and ΔT<0 when the genetic model was recessive. The only discrepancy is that in case of low MAFs and small sample sizes, theoretically ΔT would be less than zero under the recessive model, but empirically it is greater than zero. We suspect it is because the parametric assumptions and asymptotic theory behind T Add CA do not hold in these circumstances, whereas T JT does not impose assumptions on data distribution.

Real data analysis
We further compared T JT and T Add CA in real data, which confirmed the simulation results. The first example is on the association between the variant rs2398162 and hypertension in the

PLOS ONE
Wellcome Trust Case Control Consortium study [13]. There were 1940 cases (r 0 = 1205, r 1 = 624, r 2 = 111) and 2923 controls (s 0 = 1608, s 1 = 1121, s 2 = 194), and it was suggested the minor allele have a dominant effect. Applying the two trend tests on the dataset, we can obtain T JT = 22.82 (p-value = 1.8×10 −6 ) and T Add CA ¼ 19:97 (p-value = 7.9×10 −6 ). These results were in line with the observations in the simulation that T JT is more powerful than T Add CA when the underlying genetic model is dominant.
Additional real data analyses came from case-control studies for falciparum malaria [14], age-related macular degeneration (AMD) [15] and hypertension with additional variants, with the genotype counts all extracted from [9]. In the falciparum malaria study, variant rs10900589 in ATP2B4 was associated with the disease in the Ghanaian samples. This association was also evaluated in the Gambian samples and it was significant under a recessive model but insignificant under dominant and additive models. In the AMD study, variants rs380390 and rs10131337 in CFH were associated with AMD. We examined the associations of the three variants with the diseases using both tests. Cell counts, test statistics and p-values are reported in and T JT < T Add CA , which were consistent with the simulation results that T JT is less powerful than T Add CA when the underlying genetic model is recessive. For both rs380390 and rs10131337, the minor allele approximately acts in an additive mode r 1 n 1 � r 0 =n 0 þr 2 =n 2 2 � � and T JT � T Add CA , which were consistent with the simulation results that the two tests have comparable power when the underlying model is additive. In the hypertension study, we compared the two tests on three SNPs that showed genome-wide significance. The results are reported in S2 Table. The conclusion still holds in this real data analysis: T JT and T Add CA had similar power when the genetic model was close to be additive (rs7961152, rs1937506, rs6997709).
To assess the potential genotyping errors among the variants considered in the real data analysis, we tested the Hardy-Weinberg equilibrium (HWE) among the cases and controls, separately, for each variant [16,17]. An exact test for HWE was conducted using the R package HardyWeinberg [18], and the results with exact p-values for the variants were reported in S3 Table. Results showed that the p-values of the HWE tests for all the variants were larger than 0.01, with only two between 0.01 and 0.05, suggesting that there was little evidence of genotyping error among the variants. Moreover, we conducted allelic test to evaluate the association for the variants. The allelic test assesses the genetic association at the allele level by collapsing the genotypes into the counts for reference and alternative alleles, between cases and controls, however, this approach is not robust against the HWE departures [4]. The test statistics and pvalues of allelic test results were summarized in S3 Table. Of note, the results were close to those of T Add CA , suggesting that the assumptions of HWE were not violated.

Discussion
Our previous work elucidates the mechanism of the CA trend test that it examines the location shift of genotype scores between the case and control groups [19] by measuring the goodness of fit of a linear regression model correlating proportions of cases in genotype groups with their respective scores [20]. The preassigned scores reflect assumptions on the genetic model. In contrast, the JT trend test examines the location shift of phenotype scores among genotype groups in a rank-based nonparametric way without making assumptions on genetic models or data distribution. The power difference between the two tests in different situations shown in this study can be explained by their properties. When the underlying model is dominant, T Add CA suffers power loss because of the wrong model assumption that it is inferior to T JT ; when the underlying model is recessive, the limited information on location shift hampers T JT more than the wrong model assumption hampers T Add CA ; in case of low MAFs and small sample sizes where the large-sample theory breaks, T JT outperforms T Add CA because it does not impose assumptions on data distribution as the latter does.
In sum, in this study we compared the power of T Add CA and T JT under different situations. By simulation and real data examples, we show T JT can provide a valuable alternative to T Add CA in case of small sample sizes and low MAFs; when the genetic mechanism is known to be dominant, or that is the only model of interest, T JT is preferred. However, in a moderate to large sample size study with the true mode of inheritance unknown, the use of the JT trend test is  Table 1. https://doi.org/10.1371/journal.pone.0280809.g002

PLOS ONE
not recommended compared to the CA trend test under an additive model, which is more robust under a wide range of modes of inheritance.
Supporting information S1